Everything Totally Explained


Ask & we'll explain, totally!
Context-free grammar
Totally Explained


  NEW! All the latest news in the worlds of computer gaming, entertainment, the environment,  
finance, health, politics, science, stocks & shares, technology and much, much, more.  


View this entry using RSS

Everything about Context-free Grammar totally explained

In formal language theory, a context-free grammar (CFG) is a grammar in which every production rule is of the form:Vw where V is a single nonterminal symbol, and w is a string of terminals and/or nonterminals (possibly empty). The term "context-free" expresses the fact that nonterminals can be rewritten without regard to the context in which they occur. A formal language is context-free if some context-free grammar generates it.
   Context-free grammars play a central role in the description and design of programming languages and compilers. They are also used for analyzing the syntax of natural languages.

Background

Since the time of Pāṇini, at least, linguists have described the grammars of languages in terms of their block structure, and described how sentences are recursively built up from smaller phrases, and eventually individual words or word elements.
   The context-free grammar (or "phrase-structure grammar" in the mid-1950s, took the manner in which linguistics had described this grammatical structure, and then turned it into rigorous mathematics. A context-free grammar provides a simple and precise mechanism for describing the methods by which phrases in some natural language are built from smaller blocks, capturing the "block structure" of sentences in a natural way. Its simplicity makes the formalism amenable to rigorous mathematical study, but it comes at a price: important features of natural language syntax such as agreement and reference can't be expressed in a natural way, or not at all.
   Block structure was introduced into computer programming languages by the Algol project, which, as a consequence, also featured a context-free grammar to describe the resulting Algol syntax. This became a standard feature of computer languages, and the notation for grammars used in concrete descriptions of computer languages came to be known as Backus-Naur Form, after two members of the Algol language design committee.
   The "block structure" aspect that context-free grammars capture is so fundamental to grammar that the terms syntax and grammar are often identified with context-free grammar rules, especially in computer science. Formal constraints not captured by the grammar are then considered to be part of the "semantics" of the language.
   Context-free grammars are simple enough to allow the construction of efficient parsing algorithms which, for a given string, determine whether and how it can be generated from the grammar. An Earley parser is an example of such an algorithm, while the widely used LR and LL parsers are more efficient algorithms that deal only with more restrictive subsets of context-free grammars.

Formal definitions

A context-free grammar G is a 4-tuple:
G = (V,, Sigma,, R,, S,) where
   1. V, is a finite set of non-terminal characters or variables. They represent different types of phrase or clause in the sentence.
   2. Sigma, is a finite set of terminals, disjoint with V,, which make up the actual content of the sentence.
   3. R, is a relation from V, to (VcupSigma)^S indicates a substring recognized as belonging to S. This hierarchy can also be seen as a tree:
S /| / | / | S '+' S /| | / | | S '+' S 'a' | | '1' '1'
   This tree is called a concrete syntax tree (see also abstract syntax tree) of the string. In this case the presented leftmost and the rightmost derivations define the same syntax tree; however, there's another (leftmost) derivation of the same string » S → S + S (1)


      → 1 + S (2) »    → 1 + S + S (1)


      → 1 + 1 + S (2) »    → 1 + 1 + a (3)

and this defines the following syntax tree:
S /| / | / | S '+' S | /| | / | '1' S '+' S | | '1' 'a'
   If, for certain strings in the language of the grammar, there's more than one parsing tree, then the grammar is said to be an ambiguous grammar. Such grammars are usually hard to parse because the parser can't always decide which grammar rule it has to apply. Usually, ambiguity is a feature of the grammar, not the language, and an unambiguous grammar can be found which generates the same context-free language. However, there are certain languages which can only be generated by ambiguous grammars; such languages are called inherently ambiguous.

Normal forms

Every context-free grammar that doesn't generate the empty string can be transformed into one in which no rule has the empty string as a product [arule with ε as a product is called an ε-production]. If it does generate the empty string, it'll be necessary to include the rule S arr epsilon, but there need be no other ε-rule. Every context-free grammar with no ε-production has an equivalent grammar in Chomsky normal form or Greibach normal form. "Equivalent" here means that the two grammars generate the same language.
   Because of the especially simple form of production rules in Chomsky Normal Form grammars, this normal form has both theoretical and practical implications. For instance, given a context-free grammar, one can use the Chomsky Normal Form to construct a polynomial-time algorithm which decides whether a given string is in the language represented by that grammar or not (the CYK algorithm).

Undecidable problems

Although some operations on context-free grammars are decidable due to their limited power, CFGs do have interesting undecidable problems. One of the simplest and most cited is the problem of deciding whether a CFG accepts the language of all strings. A reduction can be demonstrated to this problem from the well-known undecidable problem of determining whether a Turing machine accepts a particular input. The reduction uses the concept of a computation history, a string describing an entire computation of a Turing machine. We can construct a CFG that generates all strings that are not accepting computation histories for a particular Turing machine on a particular input, and thus it'll accept all strings only if the machine doesn't accept that input.
   As a consequence of this, it's also undecidable whether two CFGs describe the same language, since we can't even decide whether a CFG is equivalent to the trivial CFG deciding the language of all strings.
   Another point worth mentioning is that the problem of determining if a context-sensitive grammar describes a context-free language is undecidable.

Extensions

An obvious way to extend the context-free grammar formalism is to allow nonterminals to have arguments, the values of which are passed along within the rules. This allows natural language features such as agreement and reference, and programming language analogs such as the correct use and definition of identifiers, to be expressed in a natural way. E.g. we can now easily express that in English sentences, the subject and verb must agree in number.
   In computer science, examples of this approach include affix grammars, attribute grammars, indexed grammars, and Van Wijngaarden two-level grammars.
   Similar extensions exist in linguistics.
   Another extension is to allow additional symbols to appear at the left hand side of rules, constraining their application. This produces the formalism of context-sensitive grammars.

Linguistic applications

Chomsky initially hoped to overcome the limitations of context-free grammars by adding transformation rules., although his specific examples regarding the inadequacy of context free grammars (CFGs) in terms of their weak generative capacity were later disproved. Gerald Gazdar and Geoffrey Pullum have argued that despite a few non-context-free constructions in natural language (such as cross-serial dependencies in Swiss German), the vast majority of forms in natural language are indeed context-free.Further Information

Get more info on 'Context-free Grammar'.


External Link Exchanges

Do you know how hard it is to get a link from a large encyclopaedia? Well we're different and will prove it. To get a link from us just add the following HTML to your site on a relevant page:

    <a href="http://context-free_grammar.totallyexplained.com">Context-free grammar Totally Explained</a>

Then simply click through this link from your web page. Our crawlers will verify your link, extract the title of your web page and instantly add a link back to it. If you like you can remove the words Totally Explained and embed the link in article text.
   As long as your link remains in place, we'll keep our link to you right here. Please play fair - our crawlers are watching. Your site must be closely related to this one's topic. Any kind of spamming, dubious practises or removing the link will result in your link from us being dropped and, potentially, your whole site being banned.



Copyright © 2007-8 totallyexplained.com | Licensed under the GNU Free Documentation License | Site Map
This article contains text from the Wikipedia article Context-free grammar (History) and is released under the GFDL | RSS Version